Materials for “Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting”
Abstract
We provide the proofs for the theorems in the main paper.

1 Proofs for Planted Clustering

In this section, Theorems 1–6 refer to the theorems in the main paper. Equations are numbered continuously from the main paper. We let $n_1 := rK$ and $n_2 := n - rK$ be the numbers of non-isolated and isolated nodes, respectively.

1.1 Proof of Theorem 1

The proof relies on information-theoretic arguments and Fano's inequality [4]. We use $D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q))$ to denote the KL divergence between two Bernoulli distributions with means $p$ and $q$. We first state an upper bound on $D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q))$, which is used later in the proof:

\[
D(\mathrm{Ber}(p) \,\|\, \mathrm{Ber}(q)) = p \log\frac{p}{q} + (1-p) \log\frac{1-p}{1-q} \overset{(a)}{\le} p\,\frac{p-q}{q} + (1-p)\,\frac{q-p}{1-q} = \frac{(p-q)^2}{q(1-q)}, \tag{16}
\]

where (a) follows from the inequality $\log x \le x - 1$, $\forall x \ge 0$.

Let $\mathbb{P}_{(Y^*,A)}$ be the joint distribution of $Y^*$ and $A$ when $Y^*$ is sampled from $\mathcal{Y}$ uniformly at random and $A$ is generated according to the planted clustering model. Because the supremum is lower bounded by the average, we have

\[
\inf_{\hat{Y}} \sup_{Y^* \in \mathcal{Y}} \mathbb{P}\big[\hat{Y} \ne Y^*\big] \ge \inf_{\hat{Y}} \mathbb{P}_{(Y^*,A)}\big[\hat{Y} \ne Y^*\big]. \tag{17}
\]

Let $H(X)$ be the entropy of a random variable $X$ and $I(X;Z)$ the mutual information between $X$ and $Z$. By Fano's inequality, we have for any $\hat{Y}$,

\[
\mathbb{P}_{(Y^*,A)}\big(\hat{Y} \ne Y^*\big) \ge 1 - \frac{I(Y^*;A) + 1}{\log |\mathcal{Y}|}. \tag{18}
\]

Simple counting gives that $|\mathcal{Y}| = \binom{n}{n_1} \frac{n_1!}{r!\,(K!)^r}$. Note that $\binom{n}{n_1} \ge \left(\frac{n}{n_1}\right)^{n_1}$ and $\sqrt{n}\left(\frac{n}{e}\right)^n \le n! \le e\sqrt{n}\left(\frac{n}{e}\right)^n$. Applying the lower bound to $n_1!$ and the upper bound to $r!$ and $K!$, it follows that

\[
|\mathcal{Y}| \ge \frac{(n/n_1)^{n_1} \sqrt{n_1}\,(n_1/e)^{n_1}}{e\sqrt{r}\,(r/e)^r\, e^r K^{r/2} (K/e)^{n_1}} \ge \left(\frac{n}{K}\right)^{n_1} \frac{1}{e\,(r\sqrt{K})^r}.
\]
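As a sanity check on two of the bounds above, the following minimal Python sketch numerically compares the Bernoulli KL divergence with the right-hand side of (16), and $\log|\mathcal{Y}|$ with the logarithm of the final lower bound. It is not part of the proof. The function names (`kl_bernoulli`, `chi_square_bound`, `log_num_clusterings`, `log_cardinality_lower_bound`) are hypothetical and chosen only for illustration, and natural logarithms are assumed throughout.

```python
import math


def kl_bernoulli(p, q):
    """D(Ber(p) || Ber(q)) in nats, for 0 <= p <= 1 and 0 < q < 1.

    Terms with a zero coefficient are dropped (convention 0 * log 0 = 0).
    """
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total


def chi_square_bound(p, q):
    """Right-hand side of (16): (p - q)^2 / (q (1 - q))."""
    return (p - q) ** 2 / (q * (1 - q))


def log_num_clusterings(n, r, K):
    """log |Y| with |Y| = C(n, n1) * n1! / (r! (K!)^r) and n1 = r * K.

    C(n, n1) * n1! simplifies to n! / (n - n1)!, so log-gamma avoids overflow.
    """
    n1 = r * K
    return (math.lgamma(n + 1) - math.lgamma(n - n1 + 1)
            - math.lgamma(r + 1) - r * math.lgamma(K + 1))


def log_cardinality_lower_bound(n, r, K):
    """log of (n / K)^{n1} / (e (r sqrt(K))^r), the final bound on |Y|."""
    n1 = r * K
    return n1 * math.log(n / K) - 1.0 - r * math.log(r * math.sqrt(K))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    p, q = 0.3, 0.1
    print("KL:", kl_bernoulli(p, q), "bound (16):", chi_square_bound(p, q))
    n, r, K = 100, 5, 10
    print("log|Y|:", log_num_clusterings(n, r, K),
          "lower bound:", log_cardinality_lower_bound(n, r, K))
```

Working in log space keeps the factorials manageable for large $n$; the comparison is only meaningful for parameters with $rK \le n$.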